|
Chinese speech segmentation into syllables based on energies in different times and frequencies
ZHANG Yang, ZHAO Xiaoqun, WANG Digang
Journal of Computer Applications
2016, 36 (11):
3222-3228.
DOI: 10.11772/j.issn.1001-9081.2016.11.3222
Precise speech segmentation methods, which can also greatly improve the efficiency of corpus annotation works, are helpful in comparing voice with voice models in speech recognition. A new Chinese speech segmentation into syllables based on the feature of time-frequency-dimensional energy was proposed:firstly, silence frames were searched in traditional way; secondly, unvoiced frames were sought using the difference of energies in different frequencies; thirdly, the voiced frames and speech frames were looked for with the help of 0-1 energies in special frequency ranges; finally, syllable positions were given depending on the judgements above. The experimental results show that the proposed method whose syllable error is 0.0297 s and syllable deviation is 7.93% is superior to Merging-Based Syllable Detection Automaton (MBSDA) and method of Gauss fitting.
Reference |
Related Articles |
Metrics
|
|